Where Should Saliency Models Look Next?

Authors

Zoya Bylinskii

Adrià Recasens

Ali Borji

Aude Oliva

Antonio Torralba

Frédo Durand

Abstract

Recently, large breakthroughs have been observed in saliency modeling. The top scores on saliency benchmarks have become dominated by neural network models of saliency, and some evaluation scores have begun to saturate. Large jumps in performance relative to previous models can be found across datasets, image types, and evaluation metrics. Have saliency models begun to converge on human performance? In this paper, we re-examine the current state-of-the-art using a fine-grained analysis on image types, individual images, and image regions. Using experiments to gather annotations for high-density regions of human eye fixations on images in two established saliency datasets, MIT300 and CAT2000, we quantify up to 60% of the remaining errors of saliency models. We argue that to continue to approach human-level performance, saliency models will need to discover higher-level concepts in images: text, objects of gaze and action, locations of motion, and expected locations of people in images. Moreover, they will need to reason about the relative importance of image regions, such as focusing on the most important person in the room or the most informative sign on the road. More accurately tracking performance will require finer-grained evaluations and metrics. Pushing performance further will require higher-level image understanding.


Related articles

Compressed-Sampling-Based Image Saliency Detection in the Wavelet Domain

When watching natural scenes, an overwhelming amount of information is delivered to the Human Visual System (HVS). The optic nerve is estimated to receive around 10^8 bits of information a second. This large amount of information can’t be processed right away through our neural system. The visual attention mechanism enables the HVS to spend neural resources efficiently, only on the selected parts of the...


CAT2000: A Large Scale Fixation Dataset for Boosting Saliency Research

Saliency modeling has been an active research area in computer vision for about two decades. Existing state-of-the-art models perform very well in predicting where people look in natural scenes. There is, however, the risk that these models may have been overfitting themselves to available small-scale biased datasets, thus trapping the progress in a local minimum. To gain a deeper insight regar...


Learning to predict where to look in interactive environments using deep recurrent q-learning

Bottom-Up (BU) saliency models do not perform well in complex interactive environments where humans are actively engaged in tasks (e.g., making sandwiches and playing video games). In this paper, we leverage Reinforcement Learning (RL) to highlight task-relevant locations of input frames. We propose a soft attention mechanism combined with the Deep Q-Network (DQN) model to teach an RL agent h...


SUPPLEMENTAL MATERIAL: Where should saliency models look next?

In Fig. 1 we include the performances of the top four neural network models evaluated on the MIT300 dataset (as of March 2016), along with the top three non-neural-network models, and three traditional bottom-up approaches that are commonly used for saliency comparisons. The metrics reported are ones evaluated on the MIT Saliency Benchmark [1], supplemented with information gain (as recommended...


What can saliency models predict about eye movements? Spatial and sequential aspects of fixations during encoding and recognition.

Saliency map models account for a small but significant amount of the variance in where people fixate, but evaluating these models with natural stimuli has led to mixed results. In the present study, the eye movements of participants were recorded while they viewed color photographs of natural scenes in preparation for a memory test (encoding) and when recognizing them later. These eye movement...




Publication date: 2016
